Many real-world networks have broad degree distributions. For some systems, this means that the functional 
significance of the vertices is also broadly distributed; in other cases the vertices are equally significant, but in 
different ways. One example of the latter case is metabolic networks, where the high-degree vertices — the cur- 
rency metabolites — supply the molecular groups to the low-degree metabolites, and the latter are responsible 
for the higher-order biological function, of vital importance to the organism. In this paper, we propose a gen- 
eralization of currency metabolites to currency vertices. We investigate the network structural characteristics of 
such systems, both in model networks and in some empirical systems. In addition to metabolic networks, we find 
that a network of music collaborations and a network of e-mail exchange could be described by a division of the 
vertices into currency vertices and others. 



I. INTRODUCTION 

Over the last decade methods from statistical physics have 
contributed greatly to the theory of complex networks (1; 2; 3).
One of the major contributions is the development of methods
to characterize and categorize the vertices (nodes) of real-world
networks. Numerous networked systems are heterogeneous
in the sense that a majority of vertices have a degree
lower than the average, whereas a small number of vertices 
have a much higher degree than the average. For many such 
systems, one can relate the degree of a vertex to its function. 
In the network of air flights (4), for example, the central vertices
are the largest airports. These are the hubs that international
travellers can hardly avoid and arguably the most important
facilities for the function of global air transportation. Degree,
and other centrality measures (5; 6), are therefore static measures
of the importance of airports to the dynamic function of
the system. However, there are other networked systems with 
broad degree distributions where this description is incom- 
plete. Metabolism is the set of chemical reactions occurring in 
a normally functioning organism. From such a reaction system,
one can construct networks of chemical substances (7).
Such networks have heterogeneous degree distributions. The 
hubs of metabolic networks are the most abundant molecules, 
such as CO2 and H2O. These metabolites have very differ- 
ent functions compared to the low-degree vertices — they are 
present throughout the cell and participate in reactions of all 
kinds of complexity. By analogy to money, frequently chang- 
ing hands, the hubs of metabolic networks are called currency 
metabolites. For the overall function of the system (to develop
and maintain high-level biological functionality, and ultimately
life), low-degree vertices are also essential. Although
the hubs may, on average, affect the organism's health
more than the peripheral vertices, most authors agree that
using degree as a proxy of functional importance is misleading
Q B H; OH M, Qj; O Q3). Instead, the picture often
painted is that the higher functionality, and thus the most inter- 
esting information for questions of current scientific interest 
(related to evolution and metabolic diseases), is contained in 
the organization of the non-currency metabolites. For this rea- 
son, to achieve a network that is more informative, currency 
metabolites are often deleted Q S H [H H H E3). 



Another characteristic property of metabolic networks is that 
the non-currency metabolites form network clusters that are 
more connected within, than between each other. This mod- 
ular structure, one believes, is related to the function of the 
network — a network cluster (network module) is responsible 
for one relatively well-defined task in the metabolic system. 
The currency metabolites, on the other hand, are involved in 
the production of a wide variety of molecules, from many dif- 
ferent modules. Thus the currency metabolites hide the mod- 
ular network structure, something that can be used for a graph 
based definition of currency metabolites (8). If vertices are 
deleted from the network in order of highest degree, then the 
set of currency metabolites is the set of vertices that, if deleted, 
gives the highest relative modularity. (Where "relative mod- 
ularity" is a measure quantifying the tendency of the network 
to be organized in network modules, and is defined mathemat- 
ically below.) 

In this paper, we pursue the idea that the description of 
metabolic networks above — that the bulk of the dynamics are 
performed by currency metabolites, and the higher order func- 
tion is produced in the network modules by the low-degree 
vertices — also is relevant for some other networked sys- 
tems. Consider the network of people present at the venue of 
a larger scientific meeting, where two persons are linked with 
each other if they have engaged in a conversation. Probably 
most scientists have links to the people at the reception desk, 
and links to their collaborators and other scientists working 
on similar problems. The functional output of the conference 
— the advancement of science — would then be performed 
in the network clusters of people with similar interests. The 
receptionists, the currency vertices, are nevertheless impor- 
tant for the meeting to be successful, but in a different way 
than the other vertices. The modular structure of the scientists 
would be more visible if the receptionists were not included 
in the network. (Similar descriptions of social networks can 
be found in Refs. (BEl).) 

Whether or not a network is well described by a dichotomy 
of the vertices into currency and non-currency vertices is ulti- 
mately a question about the whole system, including dynamic 
processes on the network. Nevertheless, as mentioned above, 
one can define currency metabolites for any network. Since 
there is no general, functional definition of currency vertices 







FIG. 1 Example output of the network model. Model parameters 
are $g = 4$, $n_g = 10$, $n_c = 4$, $p_g = 0.4$, $p_o = 0.04$, and $p_c = 0.4$.
In (a) the clear modular structure of the network without the model 
currency vertices (MCV) is shown. In (b), we also display the MCVs 
obscuring the modular structure. 



one cannot evaluate the definition directly. We will perform 
an indirect validation by creating a model producing networks 
where the network characteristics of currency metabolites can 
be tuned continuously. Using this model, we investigate the 
parameter values where the designated currency vertices of 
the model match the identified currency vertices. By mapping 
out the network structure of the region in parameter space 
where the matching is good, one can get an indication if a 
network fits to the currency-vertex picture. We will also use 
a more direct validation for nine different types of empirical
networks: we derive model parameter values from the networks
and calculate the matching scores as for the model networks.
A high matching score will be interpreted as support for the
currency-vertex picture.

The rest of the paper is organized as follows. First, we 
define network modularity and currency vertices mathemat- 
ically. Then, we define the network model and, finally, evalu- 
ate the currency vertices of the model and empirical networks. 



A. Network modularity and currency vertices 

In this section, we will discuss how to calculate network 
modularity. For a more detailed account, see Ref. (17). Consider
a partition of the vertex set into groups, and let $e_{ij}$ denote
the fraction of edges between groups $i$ and $j$. The network
modularity of this partition is defined as (18)

$$Q = \sum_i \left[ e_{ii} - \Big( \sum_j e_{ij} \Big)^2 \right], \tag{1}$$

where the sum is over all groups of vertices. The term
$\big( \sum_j e_{ij} \big)^2$ is the expectation value of $e_{ii}$ in a random multigraph.
A prototype measure for the modularity of a graph is
$Q$ maximized over all partitions, $\hat{Q}$. For many networks with
broad degree distributions, it is common to measure network 
structure relative to a null-model of random graphs with the 
constraint that the set of degrees is the same as in $G$, $\mathcal{G}(G)$.
In principle this means that one separates degree from other
network structures, which is appropriate in our case; in fact,
this idea is implicit in the definition of currency vertices. With
this null model, we subtract the average $\hat{Q}$-value for graphs in
$\mathcal{G}(G)$ from $\hat{Q}(G)$:

$$\Delta(G) = \hat{Q}(G) - \langle \hat{Q}(G') \rangle_{G' \in \mathcal{G}(G)}, \tag{2}$$

where angular brackets denote the average over $\mathcal{G}(G)$ (d). We use
a random rewiring of the original graph to sample $\mathcal{G}(G)$ il% .
and the heuristics proposed in Ref. (17) to maximize $Q$.
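
As a small illustration of Eq. (1), the following Python sketch evaluates $Q$ for one given partition of an edge list. It assumes the usual symmetric convention in which an edge between two different groups contributes half to $e_{ij}$ and half to $e_{ji}$, so that $\sum_j e_{ij}$ is the fraction of edge ends attached to group $i$. The function name `modularity` and the data layout are illustrative choices, not taken from the C implementation mentioned in this paper, and the sketch covers neither the maximization over partitions nor the null-model subtraction of Eq. (2).

```python
from collections import defaultdict

def modularity(edges, group_of):
    """Return Q = sum_i [e_ii - (sum_j e_ij)^2] for a fixed partition.

    edges    -- list of (u, v) pairs of an undirected graph
    group_of -- dict mapping each vertex to its group label
    """
    m = len(edges)
    e_in = defaultdict(float)  # e_ii: fraction of edges inside group i
    a = defaultdict(float)     # a_i = sum_j e_ij: fraction of edge ends in group i
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        if gu == gv:
            e_in[gu] += 1.0 / m
        a[gu] += 0.5 / m
        a[gv] += 0.5 / m
    return sum(e_in[g] - a[g] ** 2 for g in a)

# Two triangles joined by a single edge, partitioned into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q = modularity(edges, groups)
```

Placing all vertices in a single group gives $Q = 0$, which is a convenient sanity check for any implementation.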

To extract the currency vertices we start with the original
graph $G_0$ and perform the following scheme:

1. Measure $\hat{Q}(G_i)$, where $i$ is the number of times this line
has been executed before this time.

2. Delete the vertex with highest degree from $G_i$ and call
this graph $G_{i+1}$.

3. Make a copy, $G'_{i+1}$, of $G_{i+1}$.

4. Rewire the edges of $G'_{i+1}$ and measure $\hat{Q}(G'_{i+1})$. Repeat
this $n_{\mathrm{iter}}$ times and calculate $\langle \hat{Q}(G') \rangle_{G' \in \mathcal{G}(G_{i+1})}$.

5. If $\Delta(G_i)$ is lower than $\Delta(G_0)$, or if $i = N - 1$, then break
the iterations.

The vertices deleted at step 2 up to the iteration that maximizes $\Delta(G_i)$
form the set of currency vertices. In this paper, we use $n_{\mathrm{iter}} = 25$. A
C implementation of this algorithm can be downloaded at
www.esc.kth.se/~pholme/curr/
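
The control flow of the scheme above can be sketched in a few lines of Python. This is only an outline under stated assumptions: the relative modularity is passed in as a user-supplied callable `delta` (a hypothetical parameter standing in for the rewiring and modularity-maximization steps, which are not reproduced here), ties in degree are broken by dictionary order (the text does not specify a rule), and the names are illustrative rather than those of the C implementation.

```python
def currency_vertices(adj, delta):
    """Degree-ordered deletion scheme (outline only).

    adj   -- dict: vertex -> set of neighbours (undirected simple graph)
    delta -- callable taking such a dict and returning the relative
             modularity of that graph (assumed to be supplied by the user)
    """
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    delta_0 = delta(adj)
    best_delta, best_k = delta_0, 0
    removed = []
    while len(adj) > 1:                          # at most N - 1 deletions
        hub = max(adj, key=lambda v: len(adj[v]))  # ties: first in dict order
        for nb in adj.pop(hub):
            adj[nb].discard(hub)
        removed.append(hub)
        d = delta(adj)
        if d < delta_0:                          # stopping rule of step 5
            break
        if d > best_delta:
            best_delta, best_k = d, len(removed)
    # Vertices deleted up to the iteration with the largest relative modularity.
    return removed[:best_k]
```

Keeping `delta` as a parameter makes it possible to test the deletion logic separately from the (much more expensive) modularity estimation.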



II. PRELIMINARIES 

In this paper, we consider networks modelled as graphs G = 
(V, E) where V is the set of N vertices and E is the set of M edges
(unordered pairs of vertices). We assume the graphs to be 
simple, i.e. that they do not have multiple edges or self-edges. 
(Graphs that are not simple are called multigraphs.) 



B. Artificial networks 

To investigate the definition of currency vertices, as 
sketched in the Introduction, we use model networks where 
one can tune the strength of modularity, number of currency 
vertices and average degrees. 

Let there be $g$ groups (corresponding to network modules),
$n_g$ vertices within each group, and $n_c$ model currency vertices
(MCVs). Then go through all pairs of distinct non-MCVs and
connect these with probability $p_g$ if they belong to the same
group, and $p_o$ otherwise. Finally, go through all pairs of vertices
containing at least one currency vertex and connect the
pair with probability $p_c$.
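
The construction just described can be written as a short Python sketch. The function name `model_network`, the vertex numbering (regular vertices first, MCVs last) and the `seed` argument are assumptions made for illustration; only the three connection rules are taken from the text.

```python
import random
from itertools import combinations

def model_network(g, n_g, n_c, p_g, p_o, p_c, seed=None):
    """Generate one realization of the model as a set of edges (u, v), u < v.

    Vertices 0 .. g*n_g - 1 are regular vertices; the last n_c are MCVs.
    Returns (edges, group), where group maps regular vertices to groups.
    """
    rng = random.Random(seed)
    n_regular = g * n_g
    n = n_regular + n_c
    group = {v: v // n_g for v in range(n_regular)}  # MCVs have no group
    edges = set()
    for u, v in combinations(range(n), 2):
        if v >= n_regular:            # u < v, so the pair contains an MCV
            p = p_c
        elif group[u] == group[v]:    # same group
            p = p_g
        else:                         # different groups
            p = p_o
        if rng.random() < p:
            edges.add((u, v))
    return edges, group

# Parameter values of Fig. 1.
edges, group = model_network(4, 10, 4, 0.4, 0.04, 0.4, seed=1)
```

Each pair of vertices is visited exactly once, so the output is a simple graph by construction.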

The expected number of vertices is

$$N = n_c + g n_g \tag{3}$$

and the expected number of edges

$$M = \frac{1}{2} \left[ p_g g n_g (n_g - 1) + p_o g n_g^2 (g - 1) + p_c n_c (N - 1) \right]. \tag{4}$$



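The modularity $Q$ for the model with $n_c = 0$ (or all MCVs
removed), partitioned according to the groups, is

$$Q = g \left\{ \frac{p_g n_g (n_g - 1)}{2M} - \left[ \frac{p_g n_g (n_g - 1)}{2M} + (g - 1) \frac{p_o n_g^2}{2M} \right]^2 \right\}. \tag{5}$$

In the limit $g, n_g \gg 1$, Eq. (5) reduces to

$$Q \approx \frac{1}{1 + g/\gamma} - \frac{1}{g}, \quad \text{where } \gamma = \frac{p_g}{p_o}. \tag{6}$$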
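
For readers who want the intermediate steps, the limit can be worked out as follows. This is a sketch based on counting the expected within-group and between-group edges of the model without MCVs, which gives $2M = g n_g [p_g (n_g - 1) + p_o n_g (g - 1)]$; the bracketed sum in Eq. (5) then equals $1/g$.

```latex
\begin{align}
Q &= \frac{g\, p_g n_g (n_g - 1)}{2M} - g \left( \frac{1}{g} \right)^2
   = \frac{p_g (n_g - 1)}{p_g (n_g - 1) + p_o n_g (g - 1)} - \frac{1}{g} \\
  &= \frac{\gamma (n_g - 1)}{\gamma (n_g - 1) + n_g (g - 1)} - \frac{1}{g}
   \;\xrightarrow{\; g,\, n_g \gg 1 \;}\;
   \frac{\gamma}{\gamma + g} - \frac{1}{g}
   = \frac{1}{1 + g/\gamma} - \frac{1}{g}.
\end{align}
```

For large $g$ the last term becomes negligible, so the modularity is governed by the ratio $g/\gamma$ alone.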



Since our model produces simple graphs (and not multigraphs,
as assumed by the theory behind the definition of $Q$), putting
$p_g = p_o$ in Eq. (5) gives $Q = 0$ only approximately. The
error in this approximation is $O(1/n_g + 1/g)$. The model can
easily be modified to produce multigraphs (by just dropping
the requirement of no self-edges or multiple edges), in which
case $p_g = p_o$ would indeed give zero modularity.



C. Matching score 

As mentioned in the Introduction, we will investigate how 
well the original structure of the network matches the out- 
put of the currency-vertex detection algorithm as a function 
of model parameter values. The quantity for measuring the 
overlap of model groups and identified network clusters is the 
fraction of overlapping group identities in the best matching 
between the two classifications. In other words, let $x_i$ be vertex
$i$'s group in the original network ($x_i \in [1, \ldots, g]$; currency
vertices are not counted as members of any group) and let $y_i$
be vertex $i$'s identity obtained from the currency-vertex detection
($y_i \in [1, \ldots, N_g]$, where $N_g$ is the number of detected groups).
Then find the labeling of the graph-clustering groups such that
each group has a unique number in the interval $[1, \ldots, N_g]$,
and that the number $n_{\mathrm{match}}$ of vertices $i$ with $x_i = y_i$ is maximized.
Then we define the matching score $\mu_g = n_{\mathrm{match}} / (g n_g)$.
We calculate $n_{\mathrm{match}}$ by a simple heuristic:

1. Start with a random labeling of the groups.

2. Select a pair of group labels.

3. If $n_{\mathrm{match}}$ does not decrease if these labels are swapped,
then swap them.

4. If an improvement has been made during the last $n_{\mathrm{rep}}$
steps, go to step 2.
FIG. 2 The maximal relative modularity as a function of the ratio
$\gamma = p_g/p_o$ of the attachment probabilities within a group and between groups. We chose
$n_c = 10$, $g = n_g$ and other parameter values such that the average
degree is 181.1 for model currency vertices, and 14.5 for the others.
The points are averages of 10 to 20 network realizations.



5. Start over from step 1 with a new random seed, unless no
new highest $n_{\mathrm{match}}$ has been found in step 4 during the last $N_{\mathrm{rep}}$
restarts.
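
The label-swapping heuristic can be sketched as a random-restart hill climb. The code below is one possible reading of the five steps, not the implementation used for the results in this paper: labels are assumed to start from 0, the function name and the default values of `n_rep` and `n_restart` are illustrative, and vertices that the detection classified as currency vertices are simply absent from `y`.

```python
import random

def group_matching_score(x, y, n_rep=100, n_restart=10, seed=None):
    """Approximate mu_g = n_match / (number of non-currency model vertices).

    x -- dict: vertex -> model group label (0, 1, ...); no currency vertices
    y -- dict: vertex -> detected cluster label (0, 1, ...)
    """
    rng = random.Random(seed)
    n_labels = max(max(x.values()), max(y.values())) + 1
    shared = [v for v in x if v in y]

    def n_match(relabel):
        return sum(1 for v in shared if x[v] == relabel[y[v]])

    best, since_best = -1, 0
    while since_best < n_restart:
        relabel = list(range(n_labels))
        rng.shuffle(relabel)                       # step 1: random labeling
        current, stale = n_match(relabel), 0
        while n_labels > 1 and stale < n_rep:
            a, b = rng.sample(range(n_labels), 2)  # step 2: pick two labels
            relabel[a], relabel[b] = relabel[b], relabel[a]
            new = n_match(relabel)
            if new >= current:                     # step 3: keep the swap
                stale = 0 if new > current else stale + 1
                current = new
            else:                                  # undo a worsening swap
                relabel[a], relabel[b] = relabel[b], relabel[a]
                stale += 1
        if current > best:                         # step 5: restart bookkeeping
            best, since_best = current, 0
        else:
            since_best += 1
    return best / len(x)
```

For a handful of groups an exhaustive search over all labelings would also be feasible; the heuristic matters when the number of clusters grows.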

In addition to measuring the matching of model groups and 
network clusters, we look at the matching between actual cur- 
rency vertices (identified by the algorithm), and the MCVs 
assigned in the model during the generation of the graph. In 
this case, we use the Jaccard index of the two sets of vertices:

$$\mu_c = \frac{|V_c \cap V'_c|}{|V_c \cup V'_c|}, \tag{7}$$

where $V_c$ is the set of detected currency vertices, $V'_c$ is the set
of MCVs, and $|\cdot|$ denotes the number of elements of a set.
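
In code, the Jaccard index of Eq. (7) is a one-liner on Python sets. The function name is illustrative, and returning `None` when both sets are empty is an assumption of this sketch (the index is undefined in that case).

```python
def currency_matching_score(detected, model):
    """Jaccard index |A & B| / |A | B| of two collections of vertices."""
    detected, model = set(detected), set(model)
    union = detected | model
    if not union:
        return None  # undefined when both sets are empty
    return len(detected & model) / len(union)
```

For example, detected currency vertices {1, 2, 3} and MCVs {2, 3, 4} share two of four distinct vertices, giving a score of 0.5.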



III. NUMERICAL RESULTS 

A. Artificial networks 

We start our numerical investigation by measuring the
matching scores for networks of different modularity. As
hinted by Eq. (5), the modularity can be controlled by the
ratio $\gamma$ of the edge probabilities between vertices of the same,
and different, groups. The measurable modularity (i.e. the one that does
not need the partition information from the network construction)
is (for fixed network sizes) monotonically increasing
with $\gamma$, see Fig. 2. This confirms the indication from Eq. (6)
that $\gamma$ works as a control parameter for the relative modularity.
We also see that the maximal value of the relative modularity
$\Delta$ depends on both the network size and $\gamma$. This effect
is smaller if one lets the degree increase with the number
of vertices (which has been observed in some classes of
networks (20; 21)), instead of keeping degree fixed as in Fig. 2.
This also suggests that comparing the $\Delta$-values of different
networks should be done carefully. The comparison built into
the currency-vertex definition algorithm concerns a sequence
of monotonically shrinking networks from the same original.
Since the size of the network does not change much during

FIG. 3 Matching scores for networks of different sizes. (a) shows the
group matching scores $\mu_g$. (b) displays the currency-vertex matching
scores $\mu_c$. The symbols and parameter values are the same as in Fig. 2.

FIG. 4 The number of network clusters $N_g$ as a function of the number
of groups $g$ in the model (a), and the group matching score $\mu_g$
as a function of $g$ (b). The other parameter values are $p_g = 0.2$,
$p_o = 0.01$, $p_c = 0.25$ and $n_c = 10$. Averages are over 20 network
realizations.

FIG. 5 The number of currency vertices $N_c$ as a function of the
number of model currency vertices $n_c$ (a), and the currency-vertex
matching score $\mu_c$ as a function of $n_c$ (b). The other parameter values
are $p_g = 0.2$, $p_o = 0.01$, $p_c = 0.25$ and $g = 6$. Averages are over 20
network realizations.



an iteration, and due to the smooth monotonic increase seen in
Fig. 2, the shrinking size during the currency-vertex definition
scheme is not a technical problem.

For real-world networks, $\Delta$ (and not $\gamma$) is a measurable
quantity. In Fig. 3 we show the matching scores as a function
of the maximal relative modularity $\hat{\Delta}$. The matching scores (both
$\mu_g$ and $\mu_c$) increase monotonically with $\hat{\Delta}$, meaning that the
picture of regular vertices grouped into clusters (instantiated
by the model) holds better the larger the relative modularity
is. For the parameter values in question, $\hat{\Delta}$-values of $\sim 0.2$
are needed for matching-score values over 0.5. For example,
if one deems values of $\mu_g$ and $\mu_c$ less than 0.5 too small, then
one can conclude that networks with $\hat{\Delta} < 0.2$ probably do not
fit the currency-vertex description. We note that in Fig. 3 the
matching scores for a given $\hat{\Delta}$-value seem to converge from
above. If the $p$-parameters ($p_g$, $p_o$ and $p_c$) are fixed as the
network size is changed, then this convergence goes in the opposite direction
(the $\mu$-values grow with the system size).

In Fig. 4 we investigate how the number of network clusters
$N_g$ depends on the number of groups $g$ in the model. For
a small number of groups, $g \approx N_g$ (as seen in Fig. 4(a)). Indeed,
the identified clusters are almost the same as the original
groups ($\mu_g \approx 1$ in Fig. 4(b)). For larger $g$, $N_g$ starts to deviate
from $g$. This deviation appears later for larger network sizes,
indicating that this is a finite-size effect. The number of vertices
sets a (trivial) upper bound of this matching. Fig. 4(a)
shows that the bound increases slower than linearly (possibly
logarithmically). In the light of this observation, if $N_g$ is too
large (considering the network size), then the currency-vertex
picture seems less appropriate.

Fig. 5 illustrates the model's dependence on the number of
MCVs, $n_c$. Just as for the number of network clusters, the
match with the corresponding model parameter is best
for small values. For larger values of $n_c$, the number of detected
currency vertices starts to deviate (becoming lower than $n_c$). From
both the $N_c$- and $\mu_c$-curves, we note that the matching score is
larger for larger networks. The mismatch between the currency
vertices of the model network construction and the currency
vertices by definition is thus a finite-size effect.



B. Empirical networks 

Now we turn to evaluating real-world networks. We perform
the identification of currency vertices as outlined above,
and obtain a decomposition of the non-currency vertices into
network clusters. From this we obtain the values of $N_c$, $N_g$, $\Delta_0$
and $\hat{\Delta}$ displayed in Table I. Furthermore, we calculate $\mu_c$
and $\mu_g$ for our model with parameter values derived from the
network: we let $g$ be the measured $N_g$, set $n_c$ equal to $N_c$,
set $n_g = (N - N_c)/g$ (rounded down to an integer) and, for $p_g$,
$p_o$ and $p_c$, use the fraction of edges between the respective
types of vertices in the empirical network. By this procedure,
we obtain matching scores giving some indication of how appropriate
the currency-vertex picture is. One difference between
the model and the empirical networks is that the clusters of
the model have the same size, whereas the cluster sizes of
the real-world networks vary. This is a feature that could
affect the results quantitatively, especially if there is a wide
distribution of cluster sizes. This is (fortunately for the analysis
method) not the case. Even if the degree distributions are
broad, the cluster-size distribution is rather narrow: the $\hat{n}_g/N$
values of Table I are low, with the atmospheric network as
an exception (the results for this network should thus be taken with a
grain of salt).
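
The parameter derivation described above can be sketched as follows. This sketch interprets "the fraction of edges between the respective types of vertices" as the number of observed edges divided by the number of vertex pairs of that type, which is an assumption of the example; the function name and the returned dictionary are likewise illustrative.

```python
from collections import Counter

def derive_model_parameters(n_vertices, edges, currency, cluster_of):
    """Estimate g, n_c, n_g, p_g, p_o and p_c from a decomposed network.

    edges      -- list of (u, v) pairs of a simple undirected graph
    currency   -- collection of detected currency vertices
    cluster_of -- dict: non-currency vertex -> cluster label
    """
    currency = set(currency)
    n_c = len(currency)
    n_regular = n_vertices - n_c
    sizes = Counter(cluster_of.values())
    g = len(sizes)
    n_g = n_regular // g                     # rounded down, as in the text

    # Number of vertex pairs (possible edges) of each type.
    pairs_g = sum(s * (s - 1) // 2 for s in sizes.values())
    pairs_o = n_regular * (n_regular - 1) // 2 - pairs_g
    pairs_c = n_vertices * (n_vertices - 1) // 2 - pairs_g - pairs_o

    found = Counter()
    for u, v in edges:
        if u in currency or v in currency:
            found["c"] += 1
        elif cluster_of[u] == cluster_of[v]:
            found["g"] += 1
        else:
            found["o"] += 1

    def density(kind, pairs):
        return found[kind] / pairs if pairs else 0.0

    return {"g": g, "n_c": n_c, "n_g": n_g,
            "p_g": density("g", pairs_g),
            "p_o": density("o", pairs_o),
            "p_c": density("c", pairs_c)}
```

The resulting values can then be fed to a generator of model networks, and the matching scores computed exactly as for the artificial networks.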

Of the nine empirical networks, three do not have
any currency vertices at all. These three are clearly disqualified
for our currency-vertex picture. Of the six networks with
$N_c > 0$, three networks (a social network of music collaborations,
a metabolic network and a network of e-mails) have
larger $\mu_c$- and $\mu_g$-values than the other networks. These networks
fulfil the structural prerequisites for a currency-vertex picture.
In the music collaboration network, we can assume the currency
vertices are studio musicians who are not strongly affiliated
with one group, or orchestra, but participate on many
artists' recordings. The e-mail network does not include spam
mails (23), so we assume the hubs are addresses that send, or
receive, information of a more general nature (cf. the example
of the social interactions at a scientific meeting in the Introduction).
We also note that this classification seems independent
of the network sizes: of the three networks with large
matching scores (and $N_c > 0$), the collaboration network is
comparatively small and dense, whereas the e-mail network
is larger and sparser; also among the networks with low $\mu$-values,
this observation holds (the protein interaction network
is large and sparse, the neural network is denser and smaller).
Furthermore, we note that the region of the network-structure
space (for $\hat{\Delta}$, $N_g$ and $N_c$) giving large matching scores (as
found in the previous section) is consistent with the observations
in Table I. Examples of networks falling outside of
these ranges are the airport network (with a too large $N_g$-value
considering its size), and the neural network (having too many
currency vertices for its size to have a good matching).



IV. CONCLUSIONS 

In this paper, we have extended an organizational principle,
known in metabolic networks, to networks in general. In this 
picture, most vertices are of relatively low-degree, grouped 
into relatively distinct network clusters. A small minority of 
the vertices, however, have much larger degree than the aver- 
age, are linked to vertices of all clusters, and thereby obscure 
the modular organization of the low-degree vertices. We call 
these currency vertices. In a functional interpretation of this 
picture, the currency vertices perform the bulk of the dynam- 
ics, whereas the more specialized (and not necessarily less im- 
portant) features of the system occur in the modules. 

By just measuring the modular structure of a network, one 
cannot validate the currency-vertex definition. Instead of a 
direct validation, one can assume the network itself is an 
encoding of the functions of the vertices, and the currency-vertex
definition is a decoding of this information (17; 29; 30).
Following this philosophy, we create a model with a tunable 
number of currency vertices, number of network clusters and 
strength of these features. The match between the encoded 
and decoded sets of currency vertices and network clusters 
is closest if the modularity is large, and the numbers of currency
vertices and network clusters are low. Using this procedure, 
we also evaluate empirical networks. We conclude that three 
of nine investigated networks fit rather well to the currency- 
vertex picture. The first of these networks is a network of 
collaborations between music artists, where we assume the 
currency vertices are studio musicians and the other vertices 
are group, or band, members (and the network clusters are 
the music groups). Our second example of a network with 
currency-vertex structure is a metabolic network — appropri- 
ate, since this class of networks is the inspiration of the con- 
cept. The third network potentially fitting our picture is an 
e-mail network, where we interpret the currency vertices as 
senders, or receivers, of general content e-mails (since the e- 
mails are sampled from a group of university e-mail accounts, 
such e-mails could be information to and from the univer- 
sity administration). The dialogues between colleagues and 
classmates presumably take place within the network clus- 
ters. These dialogs correspond to a different type of informa- 
tion process than the e-mails to the hubs, just as the function 
of currency metabolites is different from other substances in 
metabolic networks and the hubs of the music collaboration 
network have different roles than the majority of musicians. 
Among the networks not fitting the picture of currency ver- 
tices are a social network of dolphins (with a clear modular 
structure, but no currency vertices), a network of airports
network                 Ref.  $N$   $M$   $N_c$  $N_g$  $\hat{n}_g/N$  $\Delta_0$  $\hat{\Delta}$  $\mu_g$  $\mu_c$
music collaborations    (22)  198   2256  16     5      0.273          0.261       0.318           0.98(1)  0.81(7)
metabolic               (7)   473   1694  4      13     0.214          0.303       0.349           0.80(2)  0.68(9)
e-mail                  (23)  1133  5161  10     15     0.286          0.189       0.247           0.44(6)  0.45(9)
protein interaction     (24)  4168  7434  13     41     0.108          0.080       0.099           0.05(5)  0.07(4)
airport network         (25)  456   2799  29     24     0.283          0.128       0.184           0.05(3)  0.065(2)
neural network          (26)  280   1973  32     6      0.257          0.186       0.232           0.29(4)  0.02(1)
dolphin social network  (27)  62    159   0      4      0.339          0.166       0.166           0.85(2)  -
atmospheric             (28)  249   1197  0      4      0.518          0.122       0.122           0.33(1)  -
software dependence     (22)  1033  1718  0      29     0.181          0.148       0.148           0.19(1)  -

TABLE I Values (network sizes, number of currency vertices $N_c$, number of network clusters $N_g$, relative size of the largest cluster $\hat{n}_g/N$,
relative modularity $\Delta_0$ of the original network, maximal relative modularity $\hat{\Delta}$, group matching score $\mu_g$, currency-vertex matching score $\mu_c$)
for empirical networks. In the music collaboration network, vertices are jazz musicians, connected if they have appeared on the same recording.
In the metabolic and atmospheric networks the vertices are chemical substances and edges represent pairs of substances participating in the
same reaction. The metabolic data comes from reactions in the bacterium Mycoplasma genitalium and the atmospheric data regards Earth. In
the e-mail network, vertices are e-mail addresses and edges mean that at least one e-mail within the three-month sampling period has been
sent from one address to the other. The protein interaction network consists of proteins connected if they can bind physically to one another.
Vertices in the neural network are neuronal cells of the nematode Caenorhabditis elegans, and edges indicate how these are connected. In the
airport network, vertices are North American airports and edges are pairs of airports with a regular nonstop flight. The dolphin social network is
based on observed interactions between bottlenose dolphins in Doubtful Sound, New Zealand. In the software dependence data, a vertex is a
software package and a link indicates that one package requires another package to be installed to function. Some of these network datasets are
originally directed (the neuronal and e-mail networks are also weighted). These are transformed into simple graphs by reciprocating directed
edges and treating any non-zero weighted edge as an unweighted edge. The table is ordered primarily according to the $\mu_c$-values, secondarily
according to the $\mu_g$-values. The numbers in parentheses are standard errors in units of the last decimal.



and a network derived from chemical reactions in the Earth's 
atmosphere. 

We have described the currency metabolite picture as a di- 
chotomous property — networks either fit it, or not. This is 
just a simplification and one may argue that the hubs of e.g. the 
airport networks (if we for a moment ignore that our airport 
network did not pass our tests) share some of the character- 
istics of currency vertices in other networks. At least, larger 
airports have a larger fraction of transfer passengers, and thus 
a somewhat different function in the entire dynamic system 
of air travel. This also illustrates that, to determine how well 
characterized a network is by a division of the vertices into 
currency vertices and others, one needs to (in addition to the 
analysis presented in this paper) consider the dynamics of the 
subject system. 